TALP at GeoCLEF 2007: Results of a Geographical Knowledge Filtering Approach with Terrier

Authors

  • Daniel Ferrés
  • Horacio Rodríguez
Abstract

This paper describes and analyzes the results of our experiments in Geographical Information Retrieval (GIR) in the context of our participation in the CLEF 2007 GeoCLEF Monolingual English task. Our system uses Linguistic and Geographical Analysis to process topics and document collections. Geographical Document Retrieval is performed with Terrier and Geographical Knowledge Bases. Our experiments show that Geographical Knowledge Bases can be used to improve the retrieval results of the Terrier state-of-the-art IR system by filtering out documents that are not geographically relevant.

1 System Description

Our GIR system is a modified version of the system presented in GeoCLEF 2006 [1] with some changes in the Retrieval modes and the Geographical Knowledge Base. The system has four phases performed sequentially: i) a Linguistic and Geographical Analysis of the topics, ii) a thematic Document Retrieval with Terrier (a state-of-the-art search engine that implements relevance feedback and several retrieval models, such as TFIDF, BM25, and Divergence From Randomness), iii) a Geographical Retrieval task with Geographical Knowledge Bases (GKBs), and iv) a Document Filtering phase.

In addition, we have developed a toolbox based on Shape Files for countries, following [2]. A Shape File is a popular geospatial vector data format for geographic information systems software. Shape Files spatially describe geometries: points, polylines, and polygons. In this paper we focus on the analysis of the experiments at GeoCLEF 2007. For a more detailed description of the system architecture, collection processing, and tuning, consult [3].
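The filtering idea in phases iii) and iv) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `ScoredDoc` type, the toy `TOY_GKB` dictionary, and the functions `expand_places` and `filter_by_geography` are hypothetical names introduced here, and the sketch assumes that each thematically retrieved document already carries the set of place names found in it. The only point it makes is that documents mentioning none of the places related to the topic are dropped, while the thematic ranking order is preserved.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredDoc:
    """A thematically retrieved document (hypothetical structure)."""
    doc_id: str
    score: float                  # thematic score, e.g. from an IR engine
    place_names: frozenset[str]   # place names detected in the document


# Toy stand-in for a Geographical Knowledge Base (assumption, not real data):
# maps a place name to the names of places contained in it.
TOY_GKB = {"spain": ["catalonia", "barcelona", "madrid"]}


def expand_places(topic_places, gkb):
    """Return the topic places plus every place the GKB lists inside them."""
    expanded = set()
    for place in topic_places:
        key = place.lower()
        expanded.add(key)
        expanded.update(name.lower() for name in gkb.get(key, ()))
    return expanded


def filter_by_geography(ranked, allowed_places):
    """Keep documents mentioning at least one allowed place, in rank order."""
    return [
        doc for doc in ranked
        if any(name.lower() in allowed_places for name in doc.place_names)
    ]


ranked = [
    ScoredDoc("d1", 9.1, frozenset({"Lisbon"})),
    ScoredDoc("d2", 7.4, frozenset({"Barcelona"})),
    ScoredDoc("d3", 5.0, frozenset({"Spain", "France"})),
]
kept = filter_by_geography(ranked, expand_places(["Spain"], TOY_GKB))
kept_ids = [doc.doc_id for doc in kept]
```

With the toy data above, `d1` is removed because it mentions no place related to the topic, even though it has the highest thematic score.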

Related resources

TALP at GeoCLEF 2007: Using Terrier with Geographical Knowledge Filtering

This paper describes our experiments in Geographical Information Retrieval (GIR) in the context of our participation in the GeoCLEF 2007 Monolingual English task. Our system, called TALPGeoIR, follows an architecture similar to that of our previous system presented at GeoCLEF 2006 [2] with some changes in the Retrieval modes and the Geographical Knowledge Base. The system has four phases performed seque...

TALP at GeoQuery 2007: Linguistic and Geographical Analysis for Query Parsing

This paper describes our experiments on the Geographical Query Parsing pilot-task for English at GeoCLEF 2007. Our system uses some modules of a Geographical Information Retrieval system presented at GeoCLEF 2006 [3] and modified for GeoCLEF 2007. The system uses deep linguistic analysis and Geographical Knowledge to perform the task.

TALP at GeoCLEF-2006: Experiments Using JIRS and Lucene with the ADL Feature Type Thesaurus

This paper describes our experiments in Geographical Information Retrieval (GIR) in the context of our participation in the GeoCLEF 2006 Monolingual English task. The TALPGeoIR system follows an architecture similar to that of the GeoTALP-IR system presented at GeoCLEF 2005 [2] with some changes in the Retrieval modes and the Geographical Knowledge Base. The system has four phases performed sequentially...

University of Twente at GeoCLEF 2006: Geofiltered Document Retrieval

In this report we describe the approach of the University of Twente to the 2006 GeoCLEF task. It is based on retrieval by content and the subsequent filtering by geographical relevance utilizing a gazetteer. The results do not show an improvement in retrieval performance when taking geographical information into account.

The GeoTALP-IR System at GeoCLEF-2005: Experiments Using a QA-based IR System, Linguistic Analysis, and a Geographical Thesaurus

This paper describes the GeoTALP-IR system, a Geographical Information Retrieval (GIR) system. The system is described and evaluated in the context of our participation in the CLEF 2005 GeoCLEF Monolingual English task. The GIR system is based on Lucene and uses a modified version of the Passage Retrieval module of the TALP Question Answering (QA) system presented at CLEF 2004 and TREC 2004 QA eval...

Publication date: 2007